AITopics | main result

Collaborating Authors

main result

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Optimal Asymptotic Rates for (Stochastic) Gradient Descent under the Local PL-Condition: A Geometric Approach

Kassing, Sebastian, Kruse, Thomas

arXiv.org Machine LearningMay-15-2026

Stochastic gradient descent (SGD) has been studied extensively over the past decades due to its simplicity and broad applicability in machine learning. In this work, we analyze the local behavior of gradient descent and stochastic gradient descent for minimizing $C^2$-functions that satisfy the Polyak-Lojasiewicz (PL) inequality and under a multiplicative gradient noise model motivated by overparameterized neural networks. Using a geometric interpretation of the PL-condition, we prove a simple yet surprising fact: in this possibly non-convex setting, the asymptotic convergence rate of (S)GD matches the rate obtained for strongly convex quadratics.

artificial intelligence, machine learning, projn, (17 more...)

arXiv.org Machine Learning

2605.14663

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Universality in Deep Neural Networks: An approach via the Lindeberg exchange principle

Giovagnini, Filippo, Kotitsas, Sotirios, Romito, Marco

arXiv.org Machine LearningMay-5-2026

We consider the infinite-width limit of a fully connected deep neural network with general weights, and we prove quantitative general bounds on the $2$-Wasserstein distance between the network and its infinite-width Gaussian limit, under appropriate regularity assumptions on the activation function. Our main tool is a Lindeberg principle for Deep Neural Networks, which we use to successively replace the weights on each layer by Gaussian random variables.

artificial intelligence, machine learning, neural network, (19 more...)

arXiv.org Machine Learning

2605.02771

Country: North America > United States > New York (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Decentralized Machine Learning with Centralized Performance Guarantees via Gibbs Algorithms

Bermudez, Yaiza, Perlaza, Samir, Esnaola, Iñaki

arXiv.org Machine LearningApr-23-2026

In this paper, it is shown, for the first time, that centralized performance is achievable in decentralized learning without sharing the local datasets. Specifically, when clients adopt an empirical risk minimization with relative-entropy regularization (ERM-RER) learning framework and a forward-backward communication between clients is established, it suffices to share the locally obtained Gibbs measures to achieve the same performance as that of a centralized ERM-RER with access to all the datasets. The core idea is that the Gibbs measure produced by client~$k$ is used, as reference measure, by client~$k+1$. This effectively establishes a principled way to encode prior information through a reference measure. In particular, achieving centralized performance in the decentralized setting requires a specific scaling of the regularization factors with the local sample sizes. Overall, this result opens the door to novel decentralized learning paradigms that shift the collaboration strategy from sharing data to sharing the local inductive bias via the reference measures over the set of models.

artificial intelligence, machine learning, probability measure, (16 more...)

arXiv.org Machine Learning

2604.20492

Country:

Europe > Austria > Vienna (0.14)
Europe > France (0.05)
Oceania > French Polynesia (0.04)
(10 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

82096f4f6f897529ecd3eabea603e9cc-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-15-2026, 13:59:49 GMT

It is well known that this is unlearnable, but we sketch a proof here.

artificial intelligence, machine learning, probability, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

The Impact of Regularization on High-dimensional Logistic Regression

Fariborz Salehi, Ehsan Abbasi, Babak Hassibi

Neural Information Processing SystemsFeb-13-2026, 12:54:50 GMT

Logistic regression is commonly used for modeling dichotomous outcomes.

artificial intelligence, machine learning, regression, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Pasadena (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report > New Finding (0.38)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.38)

Add feedback

Modifications to a neural network's input and output layers are often required to accommodate the specificities of most practical learning tasks.

artificial intelligence, isdenseinc, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > New Jersey > Bergen County > Hackensack (0.04)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Learning with little mixing

Neural Information Processing SystemsFeb-7-2026, 19:25:40 GMT

We study square loss in a realizable time-series framework with martingale difference noise. Our main result is a fast rate excess risk bound which shows that whenever a trajectory hypercontractivity condition holds, the risk of the leastsquares estimator on dependent data matches the iid rate order-wise after a burn-in time. In comparison, many existing results in learning from dependent data have rates where the effective sample size is deflated by a factor of the mixing-time of the underlying process, even after the burn-in time. Furthermore, our results allow the covariate process to exhibit long range correlations which are substantially weaker than geometric ergodicity.

artificial intelligence, burn-in time, machine learning, (16 more...)

Neural Information Processing Systems

Country: